# Multimodal visual feature extraction

InternViT-6B-224px

InternViT-6B-224px is a foundational vision model for image feature extraction. It has about 5.9 billion parameters (5,903 million) and accepts image inputs of 224x224 pixels.
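To put the stated parameter count and input size in practical terms, the short sketch below estimates how much memory the weights alone would need at a few common storage precisions, and shows the shape of an input batch. It is back-of-the-envelope arithmetic, not part of any official tooling: the helper names `weight_memory_gib` and `input_tensor_shape` are hypothetical, the 3-channel (RGB) layout is an assumption, and the estimate ignores activations and framework overhead.

```python
# Parameter count as stated in the model description (5903 million).
NUM_PARAMS = 5_903_000_000

# Bytes per parameter for common storage precisions.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return num_params * bytes_per_param / 1024**3


def input_tensor_shape(batch_size: int, channels: int = 3, side: int = 224) -> tuple:
    """Shape of an image batch in (N, C, H, W) layout.

    The 224x224 side comes from the model description; 3 channels (RGB)
    is an assumption made for illustration.
    """
    return (batch_size, channels, side, side)


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        print(f"{name}: {weight_memory_gib(NUM_PARAMS, nbytes):.1f} GiB")
    print(input_tensor_shape(8))
```

Under these assumptions the weights come to roughly 22 GiB in float32 and about 11 GiB in half precision, which is a useful first check when deciding whether the model fits on a given accelerator.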

Image Classification


© 2025 AIbase